Data biography: Ku Klux Klan (KKK) Ledgers in the Greater Denver area¶

The Ku Klux Klan (KKK) is one of the best-known white supremacist organizations in the history of the United States. It was founded in 1865, after the end of the Civil War, to defend the interests of white people and to oppose the freedom and civil rights of Black people and immigrants. The group reached the height of its expansion in the 1920s, with members infiltrating local government, law enforcement and other organizations.

Colorado was one of the areas where the Klan was active in the 1920s, and the core object of this data biography is the Ku Klux Klan Ledgers from Greater Denver (1924-1926). This historical archive is publicly available from History Colorado. The dataset contains detailed information on Denver-area KKK members and their associates during 1924-1926, including names, addresses, and telephone numbers.

Get Access To The Data


Who?¶

The ledgers were originally compiled by the Ku Klux Klan in the 1920s to manage its membership and organize its activities in the Denver area. The Klan was powerful in Colorado at the time, and it entered detailed member information, such as names, addresses, and contact information, into the ledgers by hand for internal administrative use.

The physical ledgers were donated anonymously to History Colorado in 1946 through a staff member of the Rocky Mountain News. History Colorado has since preserved and digitized them, and it makes both the ledgers and their digital form available for the public to view and download on its website.

Above is a screenshot of the scanned PDF of the original ledger, in which the individual records are clearly visible.

When & Where?¶

The information was collected by the Ku Klux Klan between 1924 and 1926 in and around the Greater Denver area. The ledger records reflect how far the organization had penetrated local society and how it managed its membership. In 2021, History Colorado digitized the data and made it available to the public and researchers as part of its historical archives.

In [1]:
import pandas as pd
kkk_df = pd.read_csv('kkk-ledgers-index.csv', low_memory=False)
In [3]:
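import plotly.express as px
# value_counts() sorts in descending order; take the top 11 cities and drop the
# first row, which is assumed to be Denver, to leave the top 10 outside Denver.
# Note: the 'count' column name requires pandas 2.0 or later.
city_counts = kkk_df['residenceCity'].value_counts().nlargest(11).reset_index()
city_count = city_counts.tail(10)
fig = px.bar(
    city_count,
    x='residenceCity',
    y='count',
    title='Number of persons residing outside Denver',
    labels={'residenceCity': 'City', 'count': 'Count'}
)
fig.show()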
In [5]:
pd.set_option('display.max_columns', None)
sample_df = kkk_df[['fullName', 'Business Address']].dropna().sample(20)
sample_df
Out[5]:
       fullName               Business Address
6285   Monta Joy Whittaker    900 Central Sav, 15th & Arapahoe*
12649  Howard A Stowell       17th & Stout
4192   Vance V Wilson         3625 W 32 ave
11     Chas C Anderson        1900 Broadway
4427   Howard W B Hicks       1739 Arapahoe
15024  Thos J Carroll         455 Broadway
3944   Harold A Wolfinbarger  2093 So Broadway
11947  Eugene Hiram Bell      842 Walnut
12776  Harry Hubbard Collins  1538 Market
8926   Harry Thos Sheldon     1509 Cleveland Pl
10973  Louis E Knorr          5046 Pecos
11028  Chas P Pierce          4821 Lowell Blvd
10472  Harry C Herbert        2300 Lawrence
3471   Louie E Myerson        319 Symes Bldg, 16th & Champa*
12885  Ashley B Venable       1824 Curtis
16297  Floyd W Johnson        73 So Odgen St
6256   Elmon O VanBradt       301-16th St
4643   Wm Guy Cox             34th & Fox
14480  Frank A Monroe         3545 Raleigh St
4836   Alton Teter            741 Santa Fe

The chart and table above illustrate the KKK's influence at the time: in the Denver area alone its members were spread across many businesses and institutions, and its membership extended beyond Denver.
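The bar chart cell removes Denver by position, which only works if Denver is the most frequent value. A minimal sketch of the same count with Denver excluded by name is given here. The column name `residenceCity` comes from the notebook; the helper `top_cities_outside`, the placeholder rows, and the assumption that the city is spelled "Denver" in the CSV are illustrative and should be checked against the real file.

```python
import pandas as pd


def top_cities_outside(df, city_col="residenceCity", exclude="Denver", n=10):
    """Return the n most frequent cities, leaving out `exclude` by name."""
    cities = df[city_col].dropna().astype(str).str.strip()
    # Case-insensitive comparison so "DENVER" or "denver" are excluded too.
    outside = cities[cities.str.casefold() != exclude.casefold()]
    counts = outside.value_counts().head(n)
    # Build the frame explicitly so column names do not depend on the pandas version.
    return pd.DataFrame({"City": counts.index, "Count": counts.to_numpy()})


# Placeholder rows for illustration only, not taken from the ledger.
demo = pd.DataFrame(
    {"residenceCity": ["Denver", "Denver", "denver ", "Arvada", "Arvada", "Golden", None]}
)
result = top_cities_outside(demo)
print(result)
```

With the real data, `top_cities_outside(kkk_df)` could be passed to `px.bar` with `x='City'` and `y='Count'`.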


How?¶

The ledgers were originally written by hand and record a range of personal information. Because they were intended primarily for internal management, the information is highly detailed.

Post-processing involved scanning the ledgers into PDF images, using OCR for text recognition, and then manually reviewing the results and converting them to CSV format for easier analysis. History Colorado provides both the PDF images and the CSV files.

In [7]:
# Count non-null values per column and keep the 23 columns with the lowest counts.
non_counts = kkk_df.notnull().sum().sort_values(ascending=False).tail(23).reset_index()
non_counts.columns = ['Column', 'NonNullCount']
fig = px.bar(
    non_counts,
    x='Column',
    y='NonNullCount',
    title='Data filling status',
    labels={'Column': '', 'NonNullCount': 'count'},
    color='NonNullCount',
    color_continuous_scale='Inferno_r',
    template='plotly_white'
)
fig.show()

Apart from names, the other member-related fields are filled in for only about half as many records. This suggests that the KKK focused on names when keeping this ledger and treated addresses and other information as secondary.

It may also be that some people were reluctant to give out personal information (after all, they were joining an unofficial organization). If so, the Ku Klux Klan apparently did not need much information to keep in touch with its members (a name alone may have been enough to be recorded as a member).
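The bar chart shows raw non-null counts. Expressing completeness as a share of all rows makes the comparison between names and other fields more direct. The sketch below is one way to do that; `fill_rate` is a hypothetical helper and the rows are placeholders, not ledger data. Note that `notna()` treats empty strings as filled, so blank cells stored as `""` would need separate handling.

```python
import pandas as pd


def fill_rate(df):
    """Share of non-null values per column, sorted from most to least complete."""
    return df.notna().mean().sort_values(ascending=False)


# Placeholder rows for illustration only, not taken from the ledger.
demo = pd.DataFrame(
    {
        "fullName": ["Person A", "Person B", "Person C", "Person D"],
        "Business Address": ["1 Example St", None, None, "2 Example Ave"],
    }
)
rates = fill_rate(demo)
print(rates.round(2))
```

Applied to the real frame, `fill_rate(kkk_df)` would give one share per column between 0 and 1.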


Why?¶

The KKK probably collected this data to manage its internal organization, collect dues, and monitor social networks, hoping to strengthen its influence by managing its members.

In the 21st century, History Colorado has turned the ledgers to a different, openly stated purpose: education, historical transparency and social reflection. Making these materials on racism public helps people better understand the social impact of extremism and provides authentic, powerful material for social education.

Image of a Ku Klux Klan rally showing its iconic symbol: a burning cross

The original ledgers can be seen at the History Colorado Center. Online, the data are stored in two forms. The first is a PDF image file, which retains the original appearance of the ledgers and is convenient for historical comparison and direct reading. The second is a CSV file, which is suited to structured analysis.

Data fields include full names, addresses, phone numbers, business addresses, member numbers, ledger page numbers, and supplementary fields such as "symbolExists" and "Notes & Remarks". This data biography uses the CSV file provided by History Colorado, read and analyzed with Python in a Jupyter Notebook.
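Column names must match the CSV header exactly (for example `symbolExists`, as used in the code cells of this notebook), so a quick check that the expected fields are present can save a confusing `KeyError` later. The sketch below assumes the five column names that appear in this notebook's code; `missing_columns` is a hypothetical helper and the demo frame is a placeholder.

```python
import pandas as pd

# Column names used in this notebook's code cells; treat the list as an assumption.
EXPECTED = ("fullName", "residenceCity", "Business Address", "Notes & Remarks", "symbolExists")


def missing_columns(df, expected=EXPECTED):
    """Return the expected column names that are absent from df."""
    return [name for name in expected if name not in df.columns]


# Placeholder frame for illustration only; with the real file this would be kkk_df.
demo = pd.DataFrame(columns=["fullName", "residenceCity", "symbolExist"])
absent = missing_columns(demo)
print(absent)
```

An empty list from `missing_columns(kkk_df)` would mean every column the notebook relies on is available.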

Resources accessible on the Web

The dataset has some shortcomings in completeness: as noted on the website, the first 69 records are missing, and many address and telephone values are empty. In addition, columns such as "Notes & Remarks" use many abbreviations and specific code names, so readers may need specialized historical context to understand them.

The data cover only 1924-1926 and are limited geographically to the Greater Denver area, so they may not give a complete picture of the Klan nationwide.

In [9]:
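# Draw a reproducible sample of 10 rows that have a note; the original row
# index is kept so each row can be traced back to the CSV.
notes = kkk_df[['fullName', 'Notes & Remarks']].dropna().sample(n=10, random_state=42)
notes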
Out[9]:
       fullName                 Notes & Remarks
440    Harry R Miller           Name struck; "DECEASED"
10059  Harry Ellsworth Meloeny  Name struck; "DECEASED"
21001  Leslie E Keithline       Arvada - Paid Kerk Aug. 8, 1924 #36
26731  George H Goulden         Englewood to Brock 10/18
4439   Roy E Merritt            Name struck; "RESIGNED 6-15-26"
28311  George A Zuber           Rejected ref. 114506
22593  William C Callahan       Rejected; Ref 9/18 142127
17049  John R Buckwalter        Says he can't go there; Returned his own check
21491  Terry J Miller           Littleton - Paid Kerk Aug. 8, 1924 #36
26372  Ames A Martin            Rejected ref. 113529

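As you can see, in these 10 randomly selected rows, much of the information in Notes & Remarks is hard to interpret without context: entries such as "Paid Kerk Aug. 8, 1924 #36" or "Rejected ref. 114506" depend on names and reference numbers that the ledger does not explain.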
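Even without decoding every abbreviation, the remarks can be grouped by recurring keywords. The sketch below applies a simple keyword lookup to the ten remarks shown in the output above (names omitted). The keyword-to-label mapping is a hypothetical starting point, not a classification provided by History Colorado.

```python
import pandas as pd

# The ten remarks shown in the output above (names omitted).
remarks = pd.Series(
    [
        'Name struck; "DECEASED"',
        'Name struck; "DECEASED"',
        "Arvada - Paid Kerk Aug. 8, 1924 #36",
        "Englewood to Brock 10/18",
        'Name struck; "RESIGNED 6-15-26"',
        "Rejected ref. 114506",
        "Rejected; Ref 9/18 142127",
        "Says he can't go there; Returned his own check",
        "Littleton - Paid Kerk Aug. 8, 1924 #36",
        "Rejected ref. 113529",
    ]
)

# Hypothetical keyword-to-label mapping; the first matching keyword wins.
KEYWORDS = {"name struck": "struck", "rejected": "rejected", "paid": "payment"}


def label_remark(text):
    """Return the label of the first keyword found in the remark, else 'other'."""
    lowered = text.lower()
    for keyword, label in KEYWORDS.items():
        if keyword in lowered:
            return label
    return "other"


label_counts = remarks.map(label_remark).value_counts()
print(label_counts)
```

On the full column, the same function could be applied with `kkk_df['Notes & Remarks'].dropna().map(label_remark)`, and the "other" group would show which remarks still need historical context.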


One interesting feature of this ledger is a column recording whether a symbol appears next to each person (symbolExists), which I think might help distinguish actual KKK members from people who were merely associated with the Klan.

In [11]:
symbol_counts = kkk_df['symbolExists'].value_counts().reset_index()
symbol_counts.columns = ['Symbol', 'Count']

fig = px.pie(symbol_counts, names='Symbol', values='Count',
             title='People with Symbol')

fig.show()

According to the chart above, only 25.4% of the records carry a symbol. If the symbol does mark committed members, this may indicate that truly fanatical members of the Ku Klux Klan were fewer than often assumed, although the meaning of the symbol would need to be confirmed before drawing that conclusion.
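One detail affects how the percentage should be read: `value_counts()` drops missing values by default, so the pie chart's shares are computed only over rows where `symbolExists` is filled in. The sketch below compares the two denominators. `symbol_shares` is a hypothetical helper, and the "yes"/"no" values are placeholders, since the actual coding of the column is not shown in this notebook.

```python
import pandas as pd


def symbol_shares(df, col="symbolExists"):
    """Return value shares twice: ignoring missing values, and counting them."""
    among_recorded = df[col].value_counts(normalize=True)
    among_all = df[col].value_counts(normalize=True, dropna=False)
    return among_recorded, among_all


# Placeholder values for illustration only, not taken from the ledger.
demo = pd.DataFrame({"symbolExists": ["yes", "no", "no", "no", None, None, None, None]})
recorded, overall = symbol_shares(demo)
print(recorded)
print(overall)
```

If many rows in the real column are empty, the share of all records carrying a symbol would be lower than the share shown in the pie chart.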

The Ku Klux Klan was a racist, xenophobic organization with deep historical roots that became a political force in several U.S. states during the 1920s. It opposed Black people, Jews, Catholics, immigrants, gay people and other marginalized groups.

When using this data, we must be wary of its political and racist character and avoid inflicting further harm on the victims of history. While disclosing the data can be educational, it can also raise ethical and privacy concerns for the descendants of the people in the ledger. This data should be handled with respect and caution.

The madness of the KKK

This dataset can help people understand the organization and operation of the Ku Klux Klan in the 1920s, and it shows that the collection and publication of data are never neutral. The data are both a piece of the history of an extremist group and an important source of information for today's society as it confronts issues of discrimination, hatred and historical justice.

People should continue to uncover the social structures hidden behind the data, strengthen the memory of historical injustice and the awareness of resistance, and make the data truly serve the goals of social progress, fairness and justice.